Binaural audio generation via multi-task learning

نویسندگان

چکیده

We present a learning-based approach for generating binaural audio from mono using multi-task learning. Our formulation leverages additional information two related tasks: the generation task and flipped classification task. learning model extracts spatialization features visual input, predicts left right channels, judges whether channels are flipped. First, we extract ResNet video frames. Next, perform separate subnetworks based on features. method optimizes overall loss weighted sum of losses tasks. train evaluate our FAIR-Play dataset YouTube-ASMR dataset. quantitative qualitative evaluations to demonstrate benefits over prior techniques.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Image Cosegmentation via Multi-task Learning

Image segmentation has been studied in computer vision for many years and yet it remains a challenging task. One major difficulty arises from the diversity of the foreground, which often results in ambiguity of backgroundforeground separation, especially when prior knowledge is missing. To overcome this difficulty, cosegmentation methods were proposed, where a set of images sharing some common ...

متن کامل

Multi-Task Learning via Conic Programming

When we have several related tasks, solving them simultaneously is shown to be more effective than solving them individually. This approach is called multi-task learning (MTL) and has been studied extensively. Existing approaches to MTL often treat all the tasks as uniformly related to each other and the relatedness of the tasks is controlled globally. For this reason, the existing methods can ...

متن کامل

Interest Inference via Structure-Constrained Multi-Source Multi-Task Learning

User interest inference from social networks is a fundamental problem to many applications. It usually exhibits dual-heterogeneities: a user’s interests are complementarily and comprehensively reflected by multiple social networks; interests are inter-correlated in a nonuniform way rather than independent to each other. Although great success has been achieved by previous approaches, few of the...

متن کامل

Semantic Segmentation via Multi-task, Multi-domain Learning

We present an approach that leverages multiple datasets possibly annotated using different classes to improve the semantic segmentation accuracy on each individual dataset. We propose a new selective loss function that can be integrated into deep networks to exploit training data coming from multiple datasets with possibly different tasks (e.g., different label-sets). We show how the gradient-r...

متن کامل

Multi-task Learning via Non-sparse Multiple Kernel Learning

In object classification tasks from digital photographs, multiple categories are considered for annotation. Some of these visual concepts may have semantic relations and can appear simultaneously in images. Although taxonomical relations and co-occurrence structures between object categories have been studied, it is not easy to use such information to enhance performance of object classificatio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: ACM Transactions on Graphics

سال: 2021

ISSN: ['0730-0301', '1557-7368']

DOI: https://doi.org/10.1145/3478513.3480560